Automatic Morphological Generation for the Indexing of Arabic Speech Recordings

نویسندگان

  • Khaled F. Shaalan
  • Habib Talhami
  • Ibrahim Kamel
چکیده

This paper presents a novel Arabic morphological generator (AMG) for Modern Standard Arabic (MSA) which is designed and implemented using Prolog. The AMG is used to generate inflected forms of words used for the indexing of Arabic audio. These words are also the relevant terms in the Arab authority system (library information retrieval system) used in this study. The AMG generates inflected Arabic words from the root according to pre-specified morphological features that can be extended as needed. The Arabic word is represented as a feature structure which is handled through unification during the morphological generation process. The inflected forms can then be inserted automatically into a speech recognition grammar which is used to identify these words in an audio sequence or utterance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

متن کامل

Speech Retrieval Based on Automatic Indexing

We present a system that retrieves audio recordings containing spoken text in response to a given textual query. In particular, we describe indexing methods that automatically describe the content of the recordings. The indexing methods, which are based on phoneme recognition output, take account of speech recognition errors. Additionally, the indexing methods we present are suitable for a lang...

متن کامل

Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media

Social media allows people interact to express their thoughts or feelings about different subjects. However, some of users may write offensive twits to other via social media which known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...

متن کامل

مدل دو مرحله ای شکاف- گلچین برای نمایه سازی خودکار متون فارسی

Purpose: Each language has its own problems. This leads to consider appropriate models for automatic indexing of every language. These models should concern the exhaustificity and specificity of indexing.   This paper aims at introduction and evaluation of a model which is suited for Persian automatic indexing. This model suggests to break the text into the particles of candidate terms and to c...

متن کامل

Automatic Speech Recognition Texts Clustering

Abstract. This paper deals with the clustering task for Russian texts obtained using automatic speech recognition (ASR). The input for processing are recognition result for phone call recordings and manual text transcripts for these calls. We present a comparative analysis of clustering results for recognition texts and manual text transcripts, make an evaluation of how recognition quality affe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Int. J. Comput. Proc. Oriental Lang.

دوره 20  شماره 

صفحات  -

تاریخ انتشار 2007